ABSTRACT


This is the second report of an investigation to determine how implicit parallelism
in programs written in compiler languages can be recognized and exploited by
machines with highly parallel organizations. An algorithm is described which
identifies the complete serial ordering among parts of a program based on the
input-output sets of these parts, the ordering given by the programmer, and any
known essential order among the program parts. The algorithm is proved and a
demonstration given that a minimum number of comparisons of input-output sets
are made. Application of the parallel recognition procedure to subroutines, loops,
conditionals, recursive subroutines, and serial input-output device calls is
explained. The effects of particular features of several compiler languages on
parallelism are discussed. These features include loops, transfers of control,
conditionals, and conditional sequences. Requirements for replacing iterative loop
control by parallel paths of control are given. Alternative algorithms for
recognizing essential ordering are suggested which can be executed more
effectively on a highly parallel machine. Application of the given algorithm to the
syntactic definition of a context-free language is also considered.




CONTENTS

ABSTRACT
INTRODUCTION
ALGORITHM FOR DETECTING ESSENTIAL SERIAL ORDER
    Definitions
    Proof of Algorithm
    Proof of Minimal Comparisons
ANALYSIS OF FORMAL PROGRAM STRUCTURES
    Subroutines
    Loops
    Conditionals
    Recursive Subroutines
    Serial Input-Output Calls
PROGRAMMING LANGUAGE FEATURES AFFECTING PARALLELISM
    Loops
    FORTRAN DO Statement
    ALGOL FOR Statement
    COBOL PERFORM Statement
    Unconditional Transfers
    Conditionals
    Sequence of Conditionals
    Duration of Definition of an Instance
RELATED INVESTIGATIONS
    Parallel Application of the Algorithm
    Parallelism in Language Syntactic Definition
PROGRAM FOR THE NEXT INTERVAL
BIBLIOGRAPHY
DISTRIBUTION LIST


ILLUSTRATIONS

Table
    I   Reduction of Syntactic Classes

Figure
    1a  Algorithm for Essential Order Detection
    1b  Symbology for Algorithm
    2   Graph for Conditionals
    3   ${}_0S$ Graph for Analysis of Recursive Subroutines
    4   Serial Input-Output Calls



INTRODUCTION 


The object of this study is to detect instances of parallelism implicit in programs
written in compiler programming languages. The method chosen is to recognize
the essential partial ordering between program parts since only parts which are
not essentially ordered can be executed concurrently.

In this report an algorithm for formal analysis of programs is presented and
proved which yields all instances of implicit parallelism between program parts
based on input-output set intersections. Any initially known essential ordering
is used. The number of input-output set comparisons is minimal. At most, two
consecutive iterations of a loop are necessary to determine the essential order
for all iterations of the loop. Only one iteration need be analyzed for intra-loop
essential order. The inter-loop essential order is determined by using both
iterations. Sufficiency of this analysis is shown through application to
language-independent formal translator structures, including subroutines, loops,
conditionals, recursive subroutines, and serial input or output calls.

Special features of particular programming languages affecting implicit
recognition of essential ordering include loops, unconditional transfers,
conditional statements, and parallel evaluation of a sequence of conditionals. The
loop statements yield potentially the greatest opportunity for parallelism.
Conditions for replacing iterative control by a number of parallel paths of control
are given. Unconditional transfers may create loops or cross boundaries of
scopes of variables. Data dependent conditions are a principal cause of
essential ordering. The duration of definition of an instance of a variable provides
essential information for efficient memory allocation.

Alternative  algorithms  which  can  be  executed  in  parallel  to  achieve  results 
comparable  to  the  main  algorithm  are  suggested.  A  method  is  indicated  for 
reducing  the  complexity  of  syntactic  definition  in  context-free  languages  by 
establishing  classes  of  productions  which  can  be  recognized  in  parallel. 

Most present programming languages presume that programs are to be written
as a sequence of instructions. This permissible sequence, while it contains the
essential ordering (i.e., it computes each value before that value is used), also
contains much extraneous order (i.e., it orders computations for which the
order is completely immaterial). In the previous report¹ we gave an algorithm
which detects the essential ordering given a permissible ordering. In this
report we extend the algorithm to permit detection of essential ordering given
a consistent combination of essential and permissible ordering.

In order to describe the algorithm, the meaning of some terms should be given:

A  process  is  a  transformation  which  generates  a  finite  set  of  outputs 
from  a  finite  set  of  inputs. 

An output is the information possibly written into a register during
a process.

An input is the information contained in a register at time of access by
a process.

A program is a finite set of processes which can be partially ordered
by their input-output set intersections.

A process execution is the application of the process transformation
to its input set to produce its output set.

An ordered pair of inputs and outputs will be identified with each process. For
process $P_j$, this pair will be represented as $(I_j, O_j)$. It will be assumed that all
outputs are unique, that is, every time a register is written into, a new name is
created. This is done to keep separate the recognition of implicit parallelism in
names from the potentially many-to-one mapping of names into registers.
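
The unique-output assumption can be illustrated with a small renaming pass. The following is only a sketch under stated assumptions: the `Process` type, the register names, and the `register#version` naming scheme are hypothetical and do not appear in the report. Each write to a register creates a fresh name, and each read refers to the most recent name of that register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    inputs: frozenset   # names read by the process (its input set I_j)
    outputs: frozenset  # names written by the process (its output set O_j)


def rename_outputs(steps):
    """Turn (registers_read, registers_written) pairs, listed in program
    order, into processes whose output names are all unique."""
    version = {}  # register -> number of writes seen so far
    processes = []
    for reads, writes in steps:
        # A read refers to the latest name created for the register.
        ins = frozenset(f"{r}#{version.get(r, 0)}" for r in reads)
        outs = set()
        for r in writes:
            # Every write creates a new name for the register.
            version[r] = version.get(r, 0) + 1
            outs.add(f"{r}#{version[r]}")
        processes.append(Process(ins, frozenset(outs)))
    return processes


# Hypothetical program: register x is reused by the third step.
steps = [({"a"}, {"x"}), ({"x"}, {"y"}), ({"b"}, {"x"}), ({"x"}, {"z"})]
procs = rename_outputs(steps)
```

After renaming, the fourth process reads the name created by the third process and shares no name with the first, so the reuse of register `x` does not by itself order the first and fourth processes.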

The following relations between process pairs are used in this report:

${}_0T^i_k$   $P_i$ must precede $P_k$ is given.

${}_0S^i_k$   $P_i$ may precede $P_k$ is given.

$T^i_k$   $P_i$ must precede $P_k$.

$S^i_k$   $P_i$ must directly precede $P_k$.

If neither $T^i_k$ nor $T^k_i$, then processes $P_i$ and $P_k$ can be executed in either order or
concurrently. ${}_0T$, ${}_0S$, $T$, and $S$ are, respectively, the sets of all true ${}_0T^i_k$, ${}_0S^i_k$,
$T^i_k$, and $S^i_k$. The algorithm uses the given ${}_0T$ and ${}_0S$ to produce $T$ and $S$.

A graphical representation of the effect of applying the algorithm was presented in
the first report.¹ This representation is still appropriate for the revised algorithm
with the following substitution. Each relation $P_i\,R\,P_k$ was labeled as a directed
R-arc from process i to process k. Replace each R by S. To distinguish between
these S arcs and the S arcs used in the graphical illustrations in the first report,
note that the reference process pair being analyzed partitions S into three disjoint
sets: process pairs already analyzed, the reference process pair, and process
pairs to be analyzed. Consequently, there is no need for separate symbols. The
graphs used as illustrations in this report consider only ${}_0S$ and $T$.
ALGORITHM FOR DETECTING ESSENTIAL SERIAL ORDER

This section describes the algorithm (Figures 1a and 1b) for detection of
essential serial ordering of processes from their input and output sets.

From previous partial analysis or explicit indicators, essential order is
sometimes known to exist between processes of a program. Consequently, the
algorithm of the previous report¹ has been extended to include any initially known
essential order among processes as parameters to the algorithm. Therefore,
the corresponding input-output set comparisons are avoided during the algorithm.
When no essential serial order is known initially, the algorithm is equivalent to
the previous one. The extended algorithm together with formal definitions of
its parameters and the relations among them are now given. A proof of the
algorithm is also provided. The number of input-output set comparisons made
by the algorithm is shown to be minimal. The reader who does not wish to
engage in the details of the proof can obtain the essence of the algorithm and
minimal comparison argument from the definitions and subsection introductions.


Figure 1a. Algorithm for Essential Order Detection
PARAMETERS

${}_0S$ : initially given permissible ordering relation
${}_0T$ : initially known essential ordering relation
$N$ : number of processes in program
$I_j$ : input set for process j
$O_j$ : output set for process j
${}_MS$ : covering relation for the complete essential ordering
${}_MT$ : the complete essential ordering relation

SYMBOLS

$\emptyset$ : the empty set
$\cap$ : set intersection
$\leftarrow$, $\lor$, $\land$, $\lnot$ are respectively the binary operators "replaced by",
"logical inclusive or", "logical and", and "logical complement",
given in increasing binding order

SUBSCRIPTS AND SUPERSCRIPTS

For any relation $R$, written in full as ${}^t_mR^i_k$:

$t$ : indicates transitive closure of $R$
$i$ : indexes the predecessor process
$k$ : indexes the successor process
$m$ : indicates the iteration of the algorithm which produces $R$; m does not
appear in the algorithm description

$R^i_k$ = true if and only if ${}_0R^i_k$ and no assignment has been made into $R^i_k$,
or the last assignment into $R^i_k$ was true

Figure 1b. Symbology for Algorithm




DEFINITIONS

Definition 1. Process, $P_j$.

A process $P_j$ is an ordered pair of sets $(I_j, O_j)$. $I_j$ is called the input set for
$P_j$. $O_j$ is called the output set for $P_j$.

Definition 2. Program, $P$.

By a program $P$ is meant a finite set of processes $\{P_j\}$ ($j = 1, 2, \ldots, N$), for
which the intersection of the input and output sets can be used to define a strongly
anti-symmetric relation. That is, $P$ can be ordered so that for any
$P_i, P_k \in P$, $O_i \cap I_k \neq \emptyset$ implies $i < k$.

Definition 3. $R^i_k$, the arc from $P_i$ to $P_k$.

For any relation $R \subseteq (P \times P)$ we will write $R^i_k$ if and only if $P_i, P_k \in P$ and $R$
relates $P_i$ to $P_k$ in that order.

Definition 4. ${}^tR$, the transitive closure of $R$.

For any relation $R \subseteq (P \times P)$ we will write ${}^tR$ to mean that relation such that
${}^tR^i_k$ if and only if there is a sequence of $R$ arcs: $R^i_{j_1}, R^{j_1}_{j_2}, \ldots, R^{j_{n-1}}_{j_n}, R^{j_n}_k$.
Note that ${}^tR$ is always transitive.

Definition 5. $T$, the essential serial ordering.

The relation $T \subseteq (P \times P)$ is the essential serial ordering among the processes
of $P$. This order is imposed by the input-output set relation. That is, for any
$P_i, P_k \in P$, $T^i_k$ if and only if there is a sequence $O_i \cap I_{j_1} \neq \emptyset$, $O_{j_1} \cap I_{j_2} \neq \emptyset$,
$\ldots$, $O_{j_n} \cap I_k \neq \emptyset$. Thus $T$ is the transitive closure of the input-output set
relation. Then $T$ is transitive and since the input-output set relation is strongly
anti-symmetric, so is $T$.

Definition 6. $S$, the cover for the essential serial order.

$S$ is the covering relation for $T$. That is, for any $P_i, P_k \in P$, $S^i_k$ if and only if
$T^i_k$ and there is no $P_j \in P$ such that both $T^i_j$ and $T^j_k$. Note that ${}^tS = T$.
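
Definitions 4 through 6 can be made concrete with a direct, brute-force computation. The sketch below is an illustration only and is not the report's algorithm: it compares every pair of processes, which is exactly what the algorithm is designed to avoid. The function names and the five-process example are hypothetical; processes are indexed from 0.

```python
def io_relation(inputs, outputs):
    """Arcs (i, k) with O_i intersect I_k non-empty."""
    n = len(inputs)
    return {(i, k) for i in range(n) for k in range(n)
            if i != k and outputs[i] & inputs[k]}


def transitive_closure(arcs, n):
    """Definition 4: (i, k) is in the closure if a sequence of arcs leads
    from i to k. Uses Warshall's method with j as the intermediate node."""
    closure = set(arcs)
    for j in range(n):
        for i in range(n):
            if (i, j) in closure:
                for k in range(n):
                    if (j, k) in closure:
                        closure.add((i, k))
    return closure


def cover(t, n):
    """Definition 6: keep (i, k) only if no j has both (i, j) and (j, k)."""
    return {(i, k) for (i, k) in t
            if not any((i, j) in t and (j, k) in t for j in range(n))}


# Hypothetical example with five processes.
inputs = [set(), set(), {"a", "b"}, {"c"}, {"a"}]
outputs = [{"a"}, {"b"}, {"c"}, {"d"}, {"e"}]
n = len(inputs)

T = transitive_closure(io_relation(inputs, outputs), n)  # Definition 5
S = cover(T, n)                                          # Definition 6
```

In this example the closure adds the indirect arcs from processes 0 and 1 to process 3, and the cover removes them again, so the closure of `S` reproduces `T` as the note in Definition 6 states.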
Definition 7. ${}_0T$, the known essential serial order.

The relation ${}_0T$ is any subset of the relation $T$.

Definition 8. ${}_0S$, the given permissible order.

The relation ${}_0S$ is the given ordering of the processes in $P$ supplied by the
programmer. ${}_0S$ is any strongly anti-symmetric relation ${}_0S \subseteq (P \times P)$ such that
$T \subseteq {}^t_0S$ and ${}_0T \subseteq {}^t({}_0T \cap {}_0S)$. Note that $T$ satisfies the requirements for ${}_0S$.

Definition 9. ${}_mR$, the relation $R$ after m iterations.

For any relation $R$, we will write ${}_mR$ to mean the value of that relation after
the $m$th iteration of the "for i" loop (steps 3 to 17) in the algorithm.

Convention 1. $N$, $k$, $i$, and $M$.

Hereafter we will write $N$ to mean the number of processes in $P$; $k$ and $i$ will
mean respectively the values of $k$ and $i$ during the $(m+1)$st iteration of the
"for i" loop; because $N$ is finite (Definition 2) and the only loops in the algorithm
are at steps 1 and 2, the algorithm terminates in a finite number of steps and
we will write $M$ to mean the total number of iterations of the "for i" loop.

Definition 10. ${}_mC$, the compared process pair relation.

We will write ${}_mC$ ($1 \le m \le M$) to mean that relation such that ${}_mC^g_h$ if and only if
$P_g, P_h \in P$ and $i = g$ and $k = h$ for some iteration $j$ ($1 \le j \le m$) of the "for i" loop.
Note that ${}_MC^g_h$ if and only if $P_g, P_h \in P$ and $g < h$.

Note. For any program $P$ and any relations $Q, R \subseteq (P \times P)$, $Q \subseteq R$ if and only if
for all $P_i, P_k \in P$, $Q^i_k$ implies $R^i_k$.
PROOF OF ALGORITHM

The algorithm (Figures 1a and 1b) generates the relations ${}_MS$ and ${}_MT$ given the
relations ${}_0S$ and ${}_0T$, the input sets $I_i$ ($1 \le i \le N$), and the output sets $O_i$ ($1 \le i \le N$).
It will be shown that ${}_MS = S$ and ${}_MT = T$. The algorithm functions as follows.

The body of the "for i" loop (steps 3 to 17) is executed once for each arc from
$P_i$ to $P_k$ such that $P_i, P_k \in P$ and $i < k$. These arcs are sufficient because
${}_0S$, ${}_0T$, $S$, and $T$ are strongly anti-symmetric. The order of the arcs (steps 1
and 2) guarantees that all sequences of arcs connecting two processes will be
determined before the single arc connecting the processes is considered.
Therefore, all indirect $T$ paths can be determined without comparing the
input-output sets of the end processes. For each iteration of the "for i" loop, if the
arc from $P_i$ to $P_k$ is not already in ${}_mT$ (step 3), then it must either be in ${}_mS$, or
$P_i$ and $P_k$ can be executed concurrently. Therefore, if the arc is not in ${}_mS$
(step 4), then $P_i$ and $P_k$ can be executed concurrently. If the arc is in ${}_mS$ (step 4),
then the input-output comparison must be made (step 5). If the intersection is
non-empty, then the arc is in $S$ and $T$, and is added to ${}_{m+1}T$ (step 6). If the
intersection is empty, then $P_i$ and $P_k$ can be executed concurrently and the arc
will be deleted from ${}_mS$ (step 9). To ensure that the arc from $P_i$ to $P_k$ is the
only arc deleted from ${}^t_mS$, arcs are added to ${}_{m+1}S$ (steps 10 through 13). Steps
10 and 11 guarantee that there is a sequence of ${}_{m+1}S$ arcs connecting to $P_k$ from
all $P_j$ where ${}_mS^j_i$, while steps 12 and 13 guarantee that there are ${}_{m+1}S$ arcs
connecting $P_i$ to all $P_j$ where ${}_mS^k_j$. Whenever there is an ${}_{m+1}T$ arc from $P_i$
to $P_k$ (step 14 or step 6), then steps 15 through 17 are performed. Since $S$ is a
cover, step 16 is included to ensure that no sequence of ${}_{m+1}T$ arcs ending in
an arc from $P_i$ to $P_k$ is an arc in ${}_{m+1}S$. Similarly, since $T$ is transitive, step
17 includes all sequences of ${}_{m+1}T$ arcs ending in an arc from $P_i$ to $P_k$ as arcs
in ${}_{m+1}T$.
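
The functional description above can be turned into a short executable sketch. Figure 1a is not reproduced in this text, so what follows is a hedged reconstruction from the prose and the proofs, not the authors' flow chart: the loop bounds, the exact form of steps 11 and 13, and the use of explicit loops over j are assumptions, and the function name, the 0-based process indices, and the example data are hypothetical. The step numbers in the comments refer to the step numbers cited in the text.

```python
def essential_order(inputs, outputs, s0, t0=frozenset()):
    """Return (S, T, comparisons) for processes 0..n-1.

    inputs[j], outputs[j] : input and output sets of process j
    s0 : given permissible ordering, a set of arcs (i, k) with i < k
    t0 : initially known essential ordering, a subset of T
    """
    n = len(inputs)
    S, T = set(s0), set(t0)
    comparisons = 0
    for k in range(1, n):                    # step 1: successor index
        for i in range(k - 1, -1, -1):       # step 2: predecessors, nearest first
            if (i, k) not in T:              # step 3: arc not yet known essential
                if (i, k) not in S:          # step 4: not permissible, so concurrent
                    continue
                comparisons += 1
                if outputs[i] & inputs[k]:   # step 5: input-output comparison
                    T.add((i, k))            # step 6: arc is essential
                else:
                    S.discard((i, k))        # step 9: arc is not essential
                    for j in range(i):       # steps 10-11: predecessors of i
                        if (j, i) in S and (j, k) not in T:
                            S.add((j, k))
                    for j in range(k + 1, n):  # steps 12-13: successors of k
                        if (k, j) in S:
                            S.add((i, j))
                    continue
            # steps 14-17: the arc (i, k) is in T
            for j in range(i):               # step 15
                if (j, i) in T:
                    S.discard((j, k))        # step 16: S must remain a cover
                    T.add((j, k))            # step 17: T must be transitive
    return S, T, comparisons


# Hypothetical five-process program written as a strict sequence.
inputs = [set(), set(), {"a", "b"}, {"c"}, {"a"}]
outputs = [{"a"}, {"b"}, {"c"}, {"d"}, {"e"}]
s0 = {(0, 1), (1, 2), (2, 3), (3, 4)}

S, T, comparisons = essential_order(inputs, outputs, s0)
```

On this example the sketch compares 8 of the 10 process pairs: the arcs from processes 0 and 1 to process 3 are obtained by transitivity at step 17 and are never compared. Supplying a known essential arc through `t0` (together with an `s0` that satisfies Definition 8) removes the corresponding comparison as well.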

Lemma 1. For all $m$ ($0 \le m \le M$), ${}_mT \subseteq T$.

Proof. ${}_0T \subseteq T$ by Definition 7. Assume for any $m$ ($0 \le m < M$) that ${}_mT \subseteq T$.
During the $(m+1)$st iteration of the "for i" loop (steps 3 to 17), arcs are added
to ${}_{m+1}T$ only at steps 6 and 17. If ${}_{m+1}T^i_k$ is added at step 6, then $O_i \cap I_k \neq \emptyset$
(step 5) and by Definition 5, $T^i_k$. If the arc ${}_{m+1}T^j_k$ is added at step 17, then
${}_mT^j_i$ (step 17). By hypothesis ${}_mT^j_i$ implies $T^j_i$. Also $T^i_k$, since either $O_i \cap I_k \neq \emptyset$
(step 5) or ${}_mT^i_k$ (step 14). But $T^j_i$ and $T^i_k$ imply $T^j_k$, since $T$ is transitive
(Definition 5). Therefore, all ${}_{m+1}T$ arcs added during the $(m+1)$st iteration
are in $T$. Since by hypothesis, all other ${}_{m+1}T$ arcs are in $T$ it follows that
${}_{m+1}T \subseteq T$. By induction on $m$, ${}_mT \subseteq T$ for all $m$ ($0 \le m \le M$).


Lemma 2. For all $m$ ($0 \le m \le M$), ${}_mT \subseteq {}^t({}_mT \cap {}_mS)$.

Proof. ${}_0T \subseteq {}^t({}_0T \cap {}_0S)$ by Definition 8. Assume for any $m$ ($0 \le m < M$) that
${}_mT \subseteq {}^t({}_mT \cap {}_mS)$. If ${}_{m+1}T \not\subseteq {}^t({}_{m+1}T \cap {}_{m+1}S)$, then either some ${}_{m+1}T$
arc was added or some ${}_mS$ arc was deleted during the $(m+1)$st iteration of the
"for i" loop. If ${}_{m+1}T^i_k$ was added at step 6, then ${}_{m+1}S^i_k$ (step 4) and thus
${}^t({}_{m+1}T \cap {}_{m+1}S)^i_k$. If ${}_mS^i_k$ is deleted at step 9, then $\lnot\,{}_mT^i_k$ (step 3), and since no
${}_{m+1}T$ arcs are added during the $(m+1)$st iteration, ${}_mT \subseteq {}^t({}_mT \cap {}_mS)$ implies
${}_{m+1}T \subseteq {}^t({}_{m+1}T \cap {}_{m+1}S)$. If ${}_mS^j_k$ is deleted at step 16 or ${}_{m+1}T^j_k$ is added
at step 17, then either ${}_{m+1}T^i_k$ (step 6) and ${}_{m+1}S^i_k$ (step 4) or ${}_mT^i_k$ (step 14).
In either case ${}^t({}_{m+1}T \cap {}_{m+1}S)^i_k$. But if ${}_mS^j_k$ was deleted or ${}_{m+1}T^j_k$ added
(steps 16 and 17), then ${}_mT^j_i$, and thus by hypothesis ${}^t({}_mT \cap {}_mS)^j_i$. But $j < i < k$ and
all ${}_mS$ arcs deleted during the $(m+1)$st iteration are of the form ${}_mS^g_h$ where
$h = k$ (step 16), and no ${}_mT$ arcs are deleted. Therefore, ${}^t({}_{m+1}T \cap {}_{m+1}S)^j_i$. Then
${}^t({}_{m+1}T \cap {}_{m+1}S)^j_k$, since both ${}^t({}_{m+1}T \cap {}_{m+1}S)^j_i$ and ${}^t({}_{m+1}T \cap {}_{m+1}S)^i_k$. In all
cases, then, ${}_{m+1}T \subseteq {}^t({}_{m+1}T \cap {}_{m+1}S)$. By induction on $m$, ${}_mT \subseteq {}^t({}_mT \cap {}_mS)$
for all $m$ ($0 \le m \le M$).


Lemma 3. For all $m$ ($0 \le m \le M$), $T \subseteq {}^t_mS$.

Proof. If $m = 0$, then by Definition 8, $T \subseteq {}^t_0S$. The only ${}_mS$ arcs which are deleted
during the $(m+1)$st iteration of the "for i" loop are at steps 9 and 16. If ${}_mS^i_k$ is
deleted at step 9, then with the exception of ${}_mS^i_k$ itself all sequences of ${}_mS$ arcs
beginning with ${}_mS^i_k$ are retained in ${}^t_{m+1}S$ (steps 12 and 13). All sequences of ${}_mS$
arcs ending with ${}_mS^i_k$ are retained in ${}^t_{m+1}S$ (steps 10 and 11), with the exception
of ${}_mS^i_k$ and ${}^t_mS^j_k$ (step 11) where $1 \le j < i$ (step 10) and ${}_mT^j_k$ (step 11). The arc
${}_mS^i_k$ need not be retained since $O_i \cap I_k = \emptyset$ (step 8). That is, $\lnot\,T^i_k$. By Lemma 2,
${}_mT^j_k$ (step 11) implies ${}^t({}_mT \cap {}_mS)^j_k$. But $\lnot\,{}_mT^i_k$ (step 3) so that the sequence of
$({}_mT \cap {}_mS)$ arcs from $P_j$ to $P_k$ cannot contain an arc from $P_i$ to $P_k$. Thus the arc
${}^t_mS^j_k$ is retained when ${}_mS^i_k$ (step 8) is deleted. If ${}_mS^j_k$ ($1 \le j < i$) is deleted at
step 16, then ${}_mT^i_k$ (step 3) and ${}_mT^j_i$ (step 16). By Lemma 2, ${}^t({}_mT \cap {}_mS)^i_k$
and ${}^t({}_mT \cap {}_mS)^j_i$, which implies there is a sequence of ${}_mS$ arcs from $P_j$ to $P_k$
other than the single arc ${}_mS^j_k$. Therefore, deleting the arc ${}_mS^j_k$ does not delete
the arc ${}^t_mS^j_k$. Then in all cases $T \subseteq {}^t_mS$ implies $T \subseteq {}^t_{m+1}S$. By induction on $m$,
$T \subseteq {}^t_mS$ for all $m$ ($0 \le m \le M$).
Lemma 4. For all $m$ ($0 \le m \le M$), ${}_mS \cap {}_mC \subseteq {}_mT \cap {}_mC$.

Proof. If $m = 0$, then ${}_0C = \emptyset$ and we are finished. Assume for any $m$ ($0 \le m < M$)
that ${}_mS \cap {}_mC \subseteq {}_mT \cap {}_mC$. Let $i$, $k$ be respectively the values of $i$ and $k$ during the
$(m+1)$st iteration of the "for i" loop. Then consider any $P_g, P_h \in P$ such that
${}_{m+1}C^g_h$ and ${}_{m+1}S^g_h$. ${}_{m+1}C^g_h$ (Definition 10) implies that either ${}_mC^g_h$, or $g = i$ and $h = k$.
${}_{m+1}S$ arcs are added only at steps 11 and 13, and during the $(m+1)$st iteration none
of the ${}_{m+1}S$ arcs added are in ${}_{m+1}C$ (steps 10 and 12). If ${}_mC^g_h$, then ${}_{m+1}S^g_h$ implies
${}_mS^g_h$ and by hypothesis ${}_mT^g_h$, but no ${}_mT$ arcs are deleted, so ${}_{m+1}T^g_h$. Otherwise,
$g = i$ and $h = k$, and by the above argument ${}_{m+1}S^i_k$ implies ${}_mS^i_k$. Then if ${}_mT^i_k$
(step 14), ${}_{m+1}T^i_k$. If $\lnot\,{}_mT^i_k$ (step 3), then ${}_mS^i_k$ (step 4) and ${}_{m+1}S^i_k$ (step 9) require
that step 6 and not step 9 be executed. But by step 6, ${}_{m+1}T^i_k$. Then in any case,
any arc in both ${}_{m+1}S$ and ${}_{m+1}C$ is also in ${}_{m+1}T$. By induction on $m$,
${}_mS \cap {}_mC \subseteq {}_mT \cap {}_mC$ for all $m$ ($0 \le m \le M$).

Lemma 5. For all $m$ ($0 \le m \le M$) and for any $P_j, P_g, P_h \in P$, if ${}_mC^g_h$, ${}_mT^j_g$
and ${}_mT^g_h$, then both ${}_mT^j_h$ and $\lnot\,{}_mS^j_h$.

Proof. If $m = 0$, then ${}_0C = \emptyset$ and we are finished.
Assume Lemma 5 to be true for any $m$ ($0 \le m < M$). Then consider any $P_j, P_g, P_h
\in P$ such that ${}_{m+1}C^g_h$, ${}_{m+1}T^j_g$, and ${}_{m+1}T^g_h$. By Lemma 1, ${}_{m+1}T^g_h$ implies $T^g_h$, and
since $T$ is strongly anti-symmetric (Definition 5), $g < h$. Similarly, ${}_{m+1}T^j_g$
implies $j < g$. If ${}_mC^g_h$, then ${}_mC^j_g$, since $j < g < h$. Then ${}_mT^j_g$ and ${}_mT^g_h$, since none
of the ${}_{m+1}T$ arcs added during the $(m+1)$st iteration (steps 6 and 17) are in ${}_mC$. By
hypothesis ${}_mC^g_h$, ${}_mT^j_g$, and ${}_mT^g_h$ imply ${}_mT^j_h$ and $\lnot\,{}_mS^j_h$. But no ${}_mT$ arcs are deleted,
so ${}_{m+1}T^j_h$. If ${}_{m+1}S^j_h$, then since $\lnot\,{}_mS^j_h$, ${}_{m+1}S^j_h$ must have been added during the
$(m+1)$st iteration (step 11 or step 13). But it could not have been added at step 11,
since then $\lnot\,{}_mT^j_h$ (step 11). Neither could ${}_{m+1}S^j_h$ have been added at step 13, since then
$h > k$ (steps 13 and 14), but by ${}_mC^g_h$, $h \le k$. Otherwise, ${}_{m+1}C^g_h$ and $\lnot\,{}_mC^g_h$; that is
$g = i$ and $h = k$. Then ${}_mT^i_k$ (step 14) or ${}_{m+1}T^i_k$ was added (step 6). In either case
for any $j < i$ (step 15), ${}_{m+1}T^j_g$ (that is ${}_mT^j_i$) implies $\lnot\,{}_{m+1}S^j_k$ (step 16) and ${}_{m+1}T^j_k$
(step 17). Then in all cases ${}_{m+1}C^g_h$, ${}_{m+1}T^j_g$, and ${}_{m+1}T^g_h$ imply ${}_{m+1}T^j_h$ and $\lnot\,{}_{m+1}S^j_h$.
By induction on $m$, for all $m$ ($0 \le m \le M$) and any $P_j, P_g, P_h \in P$, ${}_mC^g_h$, ${}_mT^j_g$ and
${}_mT^g_h$ imply ${}_mT^j_h$ and $\lnot\,{}_mS^j_h$.




Lemma 6. ${}_MS \subseteq {}_MC$.

Proof. From the algorithm we see that all $S$ arcs added during the execution of
the algorithm are of the form ${}_MS^j_k$ (steps 10 and 11), where $1 \le j < i < k \le N$, or of
the form ${}_MS^i_j$ (steps 12 and 13), where $1 \le i < k < j \le N$. Then since ${}_0S \subseteq {}_MC$
(Definition 8), all arcs of ${}_MS$ must be in ${}_MC$.

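Theorem 1. ${}_MT = T$.

Proof. By Lemma 4, ${}_MS \cap {}_MC \subseteq {}_MT$, but since ${}_MS \subseteq {}_MC$ (Lemma 6), then
${}_MS \subseteq {}_MT$. Therefore, ${}^t_MS \subseteq {}^t_MT$, and since $T \subseteq {}^t_MS$ (Lemma 3), we have $T \subseteq {}^t_MT$.
${}_MT \subseteq T$ (Lemma 1) and $T \subseteq {}_MC$ (Definition 5), so ${}_MT \subseteq {}_MC$. Therefore, ${}_MT$ is
transitive since ${}_MT \cap {}_MC$ is transitive (Lemma 5) and ${}_MT \cap {}_MC = {}_MT$. But if ${}_MT$
is transitive, then ${}^t_MT = {}_MT$, and since we already have $T \subseteq {}^t_MT$, $T \subseteq {}_MT$. Finally,
by Lemma 1, ${}_MT \subseteq T$, so ${}_MT = T$.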


Theorem 2. ${}_MS = S$.

Proof. ${}_MS \subseteq {}_MT$ since ${}_MS \cap {}_MC \subseteq {}_MT$ (Lemma 4) and ${}_MS \subseteq {}_MC$ (Lemma 6). But
${}_MS \subseteq {}_MT$ implies ${}^t_MS \subseteq {}^t_MT$, and since ${}_MT = T$ (Theorem 1), ${}^t_MS \subseteq {}^tT$. Then since
$T$ is transitive (Definition 5), ${}^tT = T$, and therefore ${}^t_MS \subseteq T$. But $T \subseteq {}^t_MS$ (Lemma 3)
so ${}^t_MS = T$. By Lemma 5, for any $P_j, P_g, P_h \in P$, if ${}_MC^g_h$, ${}_MT^j_g$, and ${}_MT^g_h$, then
$\lnot\,{}_MS^j_h$. But ${}_MT = T$ (Theorem 1) and $T \subseteq {}_MC$ (Definition 5), so for any $P_j, P_g, P_h
\in P$, if $T^j_g$ and $T^g_h$ then $\lnot\,{}_MS^j_h$. We already have ${}^t_MS = T$. Therefore ${}_MS = S$ by
Definition 6.
PROOF OF MINIMAL COMPARISONS

It will now be shown that no algorithm can produce $S$ and $T$ from ${}_0S$ and ${}_0T$ with
fewer comparisons between input and output sets. This will be done by first
showing that one comparison must be made for each arc that is in $S$ and not in ${}_0T$,
and that one comparison must be made for each ${}^t_0S$ arc which is not in $T$. It will
then be shown that each input-output set comparison in the algorithm identifies a
unique arc which is either in $S$ and not in ${}_0T$, or is in ${}^t_0S$ and is not in $T$, and that
no comparison is made more than once.

Lemma 7. For all $m$ ($0 \le m \le M$), ${}_mS \subseteq {}^t_0S$.

Proof. ${}_0S \subseteq {}^t_0S$, by the definition of transitive closure. Assume for any
$m$ ($0 \le m < M$) that ${}_mS \subseteq {}^t_0S$. During the $(m+1)$st iteration of the "for i" loop, arcs
are added to ${}_{m+1}S$ only at steps 11 and 13. If ${}_{m+1}S^j_k$ is added at step 11, then
${}_mS^j_i$ (step 11) and ${}_mS^i_k$ (step 4). Therefore, by hypothesis ${}^t_0S^j_i$ and ${}^t_0S^i_k$. Then by
the definition of transitive closure, ${}^t_0S^j_k$. Similarly, if ${}_{m+1}S^i_j$ is added at step 13,
then ${}_mS^i_k$ (step 4) and ${}_mS^k_j$ (step 13), so that by hypothesis ${}^t_0S^i_k$ and ${}^t_0S^k_j$ and
therefore ${}^t_0S^i_j$. Then ${}_{m+1}S \subseteq {}^t_0S$, since all ${}_{m+1}S$ arcs added during the $(m+1)$st
iteration are in ${}^t_0S$ and by hypothesis all ${}_mS$ arcs are in ${}^t_0S$. By induction on $m$,
${}_mS \subseteq {}^t_0S$ for all $m$ ($0 \le m \le M$).
Theorem 3. The number of input-output set comparisons is minimal.

Proof. Let $P_i, P_k \in P$ such that ${}^t_0S^i_k$. For an algorithm to establish whether or
not $S^i_k$, it must be at least able to determine whether $T^i_k$, since $S \subseteq T$. Unless
it is given that $T^i_k$ (that is, unless ${}_0T^i_k$), it must be shown that either $O_i \cap I_k \neq \emptyset$
or that there is a $P_j \in P$ such that both $T^i_j$ and $T^j_k$ (Definition 5). If $S^i_k$, then
there is no $P_j \in P$ such that both $T^i_j$ and $T^j_k$ (Definition 6). Thus, the comparison
$O_i \cap I_k$ must be made. If $\lnot\,T^i_k$, then since $T$ is transitive there can be no $P_j \in P$ such
that both $T^i_j$ and $T^j_k$. Thus, the comparison $O_i \cap I_k$ must be made. That is, one
input-output set comparison must be made for each arc that is in $S$ and not in ${}_0T$,
and one comparison must be made for each ${}^t_0S$ arc which is not in $T$.

If during the $(m+1)$st iteration of the "for i" loop ($0 \le m < M$), the comparison $O_i \cap I_k$
is made (step 5), then ${}_mS^i_k$ (step 4). But then ${}^t_0S^i_k$, since ${}_mS \subseteq {}^t_0S$ (Lemma 7). Also,
$\lnot\,{}_mT^i_k$ (step 3), and therefore $\lnot\,{}_0T^i_k$, since the algorithm does not delete any arcs from
$T$. If it happens that $O_i \cap I_k \neq \emptyset$, then the arc ${}_mS^i_k$ is not deleted during the $(m+1)$st
iteration. But ${}_{m+1}S^i_k$ (steps 9 and 16) can not be deleted during any subsequent
iteration of the "for i" loop. That is, ${}_MS^i_k$, and therefore $S^i_k$, since ${}_MS = S$
(Theorem 2). If $O_i \cap I_k = \emptyset$, then $\lnot\,{}_mT^i_k$ (step 3) and step 6 is not executed during the
$(m+1)$st iteration, so $\lnot\,{}_{m+1}T^i_k$. But ${}_{m+1}T^i_k$ (steps 6 and 17) can not be added during
any subsequent iteration of the "for i" loop. Thus, $\lnot\,{}_MT^i_k$, and therefore $\lnot\,T^i_k$, since
${}_MT = T$ (Theorem 1). That is, each input-output comparison $O_i \cap I_k$ in the algorithm
identifies a unique arc (from $P_i$ to $P_k$) which is either in $S$ and not in ${}_0T$, or is in
${}^t_0S$ and is not in $T$. Finally, none of these comparisons is made more than once
since only the sets $O_i$ and $I_k$ are compared during the $(m+1)$st iteration
(step 5), and no two iterations of the "for i" loop have the same value for the
pair $i$, $k$ (steps 1 and 2).

Therefore, every input-output comparison made by the algorithm is necessary.
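
The count established by Theorem 3 can be evaluated directly once $S$ and $T$ are known. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and the relations are the results assumed for a hypothetical five-process program (indexed from 0) whose permissible order is a strict sequence and for which no essential order is known in advance.

```python
def transitive_closure(arcs, n):
    """All pairs (i, k) joined by a sequence of arcs (Warshall's method)."""
    closure = set(arcs)
    for j in range(n):
        for i in range(n):
            if (i, j) in closure:
                for k in range(n):
                    if (j, k) in closure:
                        closure.add((i, k))
    return closure


def required_comparisons(s0, t0, s, t, n):
    """One comparison for each arc in S but not in the given T, plus one
    for each arc in the closure of the given S that is not in T."""
    in_cover_not_given = {arc for arc in s if arc not in t0}
    permissible_not_essential = {arc for arc in transitive_closure(s0, n)
                                 if arc not in t}
    return len(in_cover_not_given) + len(permissible_not_essential)


# Hypothetical example: five processes given as a strict sequence.
n = 5
s0 = {(0, 1), (1, 2), (2, 3), (3, 4)}   # given permissible order
t0 = set()                              # nothing known initially
s = {(0, 2), (1, 2), (2, 3), (0, 4)}    # cover of the essential order
t = {(0, 2), (1, 2), (2, 3), (0, 3), (1, 3), (0, 4)}

needed = required_comparisons(s0, t0, s, t, n)
```

The two sets counted never overlap, because every arc of $S$ is also an arc of $T$. Here four arcs fall into each set, so eight of the ten process pairs must be compared; the two pairs that escape comparison are exactly the arcs of $T$ that are not in $S$.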


ANALYSIS OF FORMAL PROGRAM STRUCTURES


The ability to apply the algorithm for detection of essential serial order with a
nonempty ${}_0T$ allows more freedom in the use of the analysis. This facility is
investigated relative to certain formal program structures and the advantages
which are relevant to subroutines, conditionals and serial input-output calls are
explained. With the explicit exclusion of arrays, it is shown that loops and
recursive subroutines can be completely analyzed with only two instances of each
process. Arrays will be considered in the next report.


SUBROUTINES 

The advantages of the non-empty ${}_0T$ arise in the analysis of program structures
such as subroutines. A subroutine (whether open or closed) need not be analyzed
for each call, but may be analyzed only once and the results of that analysis used
at each call on the subroutine. This is accomplished by first analyzing the
subroutine and then using the resulting $S$ and $T$ relating the intra-subroutine
processes as ${}_0S$ and ${}_0T$, respectively, for each program call on the subroutine. The
program analysis will then identify all instances of parallelism without duplicating
any comparison of the intra-subroutine input-output sets at the various
subroutine calls.

An  alternative  method  for  handling  subroutines  reduces  the  number  of  processes 
used  in  the  analysis  and,  therefore,  the  size  of  S  and  T.  In  this  scheme,  the 
subroutine is analyzed once separately from the program. Then, rather than
inserting the subroutine analysis results into the program at each call, the
program is analyzed with each subroutine call serving as a single process. In this
scheme, parallelism will not be found between processes where one of the
processes is external to the subroutine but cannot be executed in parallel with the
entire subroutine, and where the other process is interior to the subroutine.

The above methods are not applicable to recursive subroutines, since the
substitution process is nonterminating.

LOOPS 

The algorithm as described can be used to analyze a loop by stretching it out into
a  sequence  of  iterations.  This  analysis,  however,  cannot  be  performed  until 
run  time  if  the  number  of  iterations  is  data  dependent.  Even  if  the  number  of 
iterations  can  be  determined  at  compile  time,  the  number  of  processes  produced 
by  flattening  out  the  loop  may  exceed  the  handling  capabilities. 

A  method  for  loop  handling  which  takes  advantage  of  the  similarities  between 
successive iterations of a loop and still recognizes those instances of parallelism
determined by input-output set relations is now developed. Initially we will
assume that the programs under consideration either do not contain arrays or
that each array is treated as a single variable. This restriction guarantees that
input-output sets are not a function of the iteration. That is, for any iteration of
a loop, if an instance of a variable appears in the input (or output) set for some
process, then for any other iteration of the loop, another instance of that
variable will appear in the input (or output) set of the corresponding process.

Since  each  iteration  of  a  loop  has  the  same  processes  in  the  same  given  order 
and  with  the  same  input-output  names,  analysis  of  any  iteration  of  a  loop  will 
identify the intra-iteration parallelism for all iterations of the loop. The array
handling technique mentioned above guarantees that analysis of any two
consecutive iterations of a loop will identify all inter-iteration parallelism, since
direct essential ordering of processes can exist only between processes in the same
or consecutive iterations. Therefore, loops can be handled by analyzing only two
consecutive iterations of each loop.
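
The two-iteration argument can be illustrated with a minimal Python sketch. It
is not the report's algorithm: it assumes that each assignment creates a new
instance of a variable, so that only an output of an earlier process appearing
in the input set of a later process imposes order, and it ignores any
transitive reduction. The function names and the two-process loop body are
hypothetical.

```python
from itertools import combinations

def direct_arcs(processes):
    """Return (earlier, later) pairs where an output of the earlier process
    is an input of the later one. `processes` is a list of
    (name, inputs, outputs) tuples in the given (programmer's) order."""
    arcs = []
    for (na, _ia, oa), (nb, ib, _ob) in combinations(processes, 2):
        if oa & ib:
            arcs.append((na, nb))
    return arcs

def two_iteration_arcs(body):
    """Unroll the loop body into two consecutive iterations and separate
    the arcs found within one iteration from those between iterations."""
    unrolled = [((name, it), ins, outs)
                for it in (1, 2)
                for name, ins, outs in body]
    arcs = direct_arcs(unrolled)
    intra = sorted({(a[0], b[0]) for a, b in arcs if a[1] == b[1]})
    inter = sorted({(a[0], b[0]) for a, b in arcs if a[1] != b[1]})
    return intra, inter

# Hypothetical loop body:  A: s = s + x     B: y = x * 2
body = [("A", {"s", "x"}, {"s"}), ("B", {"x"}, {"y"})]
intra, inter = two_iteration_arcs(body)
print(intra, inter)
```

In this example process A is ordered across iterations because it consumes
the value of s produced by its predecessor, while every execution of B is free
of such ordering.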


CONDITIONALS 

There are several run-time philosophies which may be used in conjunction with
conditionals. One approach permits both branches of the conditional and the
condition itself to be executed concurrently. When evaluation of the condition
is complete, one of the branches will then be inhibited. This method
reduces  the  duration  of  the  program  at  the  expense  of  performing  some 
computation  whose  outputs  will  not  be  used. 

An  alternative  approach  will,  however,  be  taken  here.  The  goal  will  be  to 
initiate each process as soon as possible without executing processes
unnecessarily. This may be done by evaluating the condition before either branch of
the conditional is initiated and then executing only the single necessary branch.
This approach does not prohibit processes common to both branches of the
conditional from being executed concurrently with the evaluation of the condition.

Conditionals can be analyzed as any other processes, except that the given ₀T
will be nonempty. For example, let process P1 be the condition, P2 and P3 be
local to one branch of the conditional, P4 and P5 be local to the other branch of
the conditional, and P6 be common to both branches. Then ₀S will be the given
order of the processes as shown in Figure 2. ₀T, however, will have four arcs,
one arc each to indicate the serial ordering between the condition evaluation and
the processes local to the conditional branches.
the  processes  local  to  the  conditional  branches.  S*  and  S 1  are  included  to 
guarantee  that  0T  c  l(0S  0  QX).  *  o  3  o 
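
The four known arcs of this example can be written down mechanically. The
sketch below is only an illustration: it assumes the process indices read
from the example (P1 the condition, P2 and P3 local to one branch, P4 and P5
local to the other, P6 common to both), and the function name is hypothetical.

```python
def conditional_known_arcs(condition, branch_a, branch_b):
    """Return one arc from the condition to each process that is local
    to either branch; processes common to both branches get no arc."""
    return [(condition, p) for p in branch_a + branch_b]

arcs = conditional_known_arcs("P1", ["P2", "P3"], ["P4", "P5"])
print(arcs)
# P6 does not appear as the head of any arc, so nothing here prevents it
# from being executed concurrently with the evaluation of P1.
```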


*  t 

The  need  for  the  requirement  T  c  (  S  n  T),  introduced  in  Definition  8,  is 

0  —  00 

illustrated  by  this  example.  If  S*  were  not  included  and  O,  0  I  «  $ ,  then 

O  6  2  3 

the  algorithm  would  not  generate  since  qT*,  and  consequently 

since even though S*.


SERIAL INPUT-OUTPUT CALLS

Many one-way input and output devices, such as card readers or line printers, are
read or written serially. To ensure that the information received from (or
transmitted to) these devices is interpreted (or displayed) as intended, the given
order of reference must be maintained. For example, if lines were sent to a line
printer in any order other than that given by the programmer, the intended format
would be disrupted. Thus, for each serial device a ₀T arc will connect those
pairs of processes which include consecutive references to that device. Let
P1, P2, P3, P4, P5, and P6, in that order, comprise a program, and let P1, P3,
and P5 include reference to a particular serial input or output device. The ₀S
and ₀T for this program will then appear as shown in Figure 4.
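
As a small illustration of how such arcs might be generated, the following
Python sketch links consecutive references to one serial device. The function
name is hypothetical, and the choice of P1, P3, and P5 as the referencing
processes is an assumption where the scanned text is unclear.

```python
def serial_device_arcs(order, users):
    """Link each process that references the device to the next process
    (in the given order) that also references it."""
    refs = [p for p in order if p in users]
    return list(zip(refs, refs[1:]))

order = ["P1", "P2", "P3", "P4", "P5", "P6"]
arcs = serial_device_arcs(order, {"P1", "P3", "P5"})
print(arcs)
```

Only consecutive references are linked; the ordering of P1 before P5 then
follows from the two arcs taken together.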


PROGRAMMING LANGUAGE FEATURES EFFECTING
PARALLELISM 

The language-independent approach to recognizing parallelism through study of
formal  translator  structures  permits  identifying  general  aspects  of  essential 
ordering  without  getting  involved  with  features  of  particular  languages. 

Specific language-dependent features are also important since the intended
applications  are  programs  written  in  actual  languages.  By  considering  the 
specific  differences  in  actual  languages,  a  comparative  basis  can  be  established 
for  recommending  that  particular  features  be  used  for  parallel  recognition,  that 
features  be  used  as  essential  order  indications,  and,  indeed,  that  languages  contain 
particular  features  to  aid  in  the  recognition  of  parallelism. 

Some  specific  features  of  languages  will  now  be  investigated.  Loop  statements 
yield  the  largest  potential  for  parallelism,  since  each  set  of  embedded  loops 
multiplies  the  number  of  opportunities  for  parallel  execution.  The  FORTRAN  DO-, 
ALGOL FOR-, and COBOL PERFORM- statements are analyzed. Unconditional
transfers  cause  problems  in  the  recognition  of  loops  and  in  crossing  the  scope 
boundary  of  variables.  Conditionals  which  are  data-dependent  pose  the  principal 
impediment  to  parallelism.  Evaluating  groups  of  conditionals  in  parallel,  rather 
than  scattering  them  through  a  program,  minimizes  the  number  of  separately 
ordered  parts  of  a  program.  Duration  of  definition  of  an  instance  of  a  variable  is 
important  to  the  mapping  of  instances  into  memory  on  a  noninterfering  basis.  A 
beginning  on  this  analysis  is  reported. 


LOOPS 

Loops  play  a  dominant  role  in  programs  written  in  present  programming  languages. 
They permit programmers to iteratively express repetitive processes with economy
of program. The iterative nature of loop control is adequate for sequential
execution. However, the iterative form impedes parallel setup of the control for loop
bodies. Gosden* has concentrated on explicit loop constructs as the most promising
sources for parallel activity. He proposes that a large fraction of all loops are
parallel, both in the control and the loop bodies, and recommends explicitly adding
the ability to specify loops as either parallel or iterative in the programming
language. 

We will now consider the control of loops and parallel establishment of multiple
paths  of  control  even  when  the  control  mechanism  is  iteratively  expressed  in  the 
programming  language. 

Some  opportunities  to  establish  in  parallel  more  than  one  execution  of  a  loop  body 
are  determined  by  the  algorithm.  The  algorithm  requires  for  concurrent  execution 
that  not  only  must  the  control  variable  be  independent  of  its  predecessor  control 
variable,  but  also  independent  of  its  predecessor  loop  body.  The  loop  control 
statements  in  FORTRAN,  ALGOL,  and  COBOL  will  be  compared  to  see  what  other 
opportunities exist for establishing concurrent paths of control for loop bodies.

At compile time, if the number of executions of a loop is recognizable as an integer,
then  parallel  paths  may  be  established.  If  the  number  of  executions  is  recognizable 
at  execute  time  upon  encountering  the  loop  entry,  then  parallel  path  controls  may 
be  established  at  that  time  to  initiate  that  number  of  loops. 

Conditional  statements  within  a  loop  body  that  can  lead  outside  the  loop  with  no 
intent  to  return  are  possible  in  ALGOL  and  FORTRAN.  COBOL  and  ALGOL  have 
explicit  forms  of  loop  control  including  condition  evaluation  to  determine  loop 
completion.  Evaluating  such  a  condition  generally  depends  on  loop-created  data 
(otherwise  an  explicit  form  for  indicating  the  number  of  iterations  of  the  loop 
would  have  been  used).  Consequently,  there  is  an  essential  order  between  cycles 
of  the  loop  when  a  condition  determines  the  exit.  In  some  cases  it  may  be  possible 
to  reformulate  the  loop  to  separate  all  condition  evaluations  from  loop  body 
execution. 

FORTRAN  and  COBOL  program  units  are  characterized  by  static  storage  require¬ 
ments  determinable  at  compile  time.  ALGOL  program  units,  on  the  other  hand, 
assume  dynamic  storage  requirements.  The  effect  of  this  difference  on  loop 
control  is  to  allow  significantly  more  ways  to  defer  to  execute  time  the  decision  on 
number  of  loop  executions  in  ALGOL,  and  to  make  loop  executions  essentially 
ordered. 

Further  interpretations  and  restrictions  on  these  general  ideas  are  developed  in  the 
following  three  descriptions  of  the  particular  loop  statements  in  each  language. 

FORTRAN  DO  Statement* 

A  DO  statement  is  of  the  form 

    DO n i = m1, m2, m3

where:  n is the statement label of an executable statement occurring as the
        terminal statement of the associated DO. The statement must follow
        the DO and be in the same program unit. The terminal statement may
        not be a GO TO of any form, arithmetic IF, RETURN, STOP, PAUSE,
        DO statement, nor a logical IF containing any of these forms. In effect,
        this allows only the DO loop control to follow execution of the terminal
        statement.

        i is an integer variable name of the control variable,
        m1 is the initial parameter,
        m2 is the termination parameter,
        m3 is the incrementation parameter if present, otherwise +1 is implied.

*This description of the FORTRAN DO statement is adapted from reference 3.

Each of m1, m2, and m3 is either an integer constant or integer variable reference.
At time of execution each must be positive and m1 ≤ m2. The range of the DO is
the  set  of  executable  statements  following  the  DO  statement  through  the  terminal 
statement.  Procedure  actions  required  within  the  range  are  assumed  to  be 
temporarily  within  the  range. 

Redefining (by assigning a new value to) any of i, m1, m2, m3 is prohibited
during  the  execution  of  the  range  of  the  DO.  This  means  that  the  maximum  number 
of  executions  is  always  known  before  first  executing  the  range. 

The DO statement execution sequence is 1) i = m1; 2) execute range, if the terminal
statement is reached; 3) i = i + m3, if i ≤ m2 GO TO 2); 4) exit with DO satisfied.

Exiting from the range of a DO may occur by execution of a GO TO statement or an
arithmetic IF, that is, exiting may occur without satisfying the DO.

A GO TO or arithmetic IF statement may not cause control to pass into the range
of a DO from outside its range, except as described below for the extended range.

All values of the control variable can be assigned at compile time if the following
two conditions hold: 1) m1, m2, and m3 are integer constants, 2) there occurs
no exit from the range of the DO by execution of a GO TO statement or an
arithmetic IF statement. If these conditions hold, it is possible to establish k = 1 plus
the greatest integer in (m2 - m1)/m3 parallel assignments.
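
The count k and the corresponding control values can be checked with a short
Python sketch. It assumes positive integer parameters with m1 not greater than
m2, as required at execution time, and a range that is always left through the
terminal statement. The function names are hypothetical.

```python
def do_trip_count(m1, m2, m3=1):
    """k = 1 plus the greatest integer in (m2 - m1)/m3."""
    if m1 <= 0 or m2 <= 0 or m3 <= 0 or m1 > m2:
        raise ValueError("parameters must be positive with m1 <= m2")
    return 1 + (m2 - m1) // m3

def do_control_values(m1, m2, m3=1):
    """All values of the control variable, produced without iteration
    from one value to the next (each depends only on m1, m3, and j)."""
    return [m1 + j * m3 for j in range(do_trip_count(m1, m2, m3))]

def do_sequential(m1, m2, m3=1):
    """The iterative sequence: set i, execute the range, increment, test."""
    values = []
    i = m1
    while True:
        values.append(i)      # stands in for executing the range
        i += m3
        if i > m2:
            break
    return values

print(do_trip_count(1, 10, 3), do_control_values(1, 10, 3))
```

Because each control value depends only on the loop parameters, the k
assignments are independent of one another, which is what permits them to be
established in parallel.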

If condition 1) is relaxed to permit integer variables for any of m1, m2, or m3,
then at compile time it is possible to add the above computation for k as a control
process which can then establish that number of parallel control paths for executing
the ranges.

Nested DO statements are possible so long as the range of the contained DO is a
subset of the range of the containing DO. Execution order is inside out. A completely
nested nest of DO statements occurs when the first occurring terminal statement of
any DO statement follows the last occurring DO statement and the first occurring DO
statement of the set is not in the range of any DO statement. For such a completely
nested nest of DO statements, an extended range is permitted for the innermost of
the DO statements, from which control may pass external to the nest and return to
the innermost. No recursive use of the extended range is permitted.

It is not necessary that the range of an embedded DO statement be parallel for the
range of an outer DO to be parallel. A nest of DO statements may be totally
parallel, if all DO statements in the nest are parallel. In this case the product
k1 × k2 × ... × kn paths of control may be established.
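
The product of trip counts for a totally parallel nest can be sketched as
follows. The sketch assumes that every DO in the nest has constant positive
parameters and that the bounds of an inner DO do not depend on an outer control
variable. The function names and the example bounds are hypothetical.

```python
from itertools import product
from math import prod

def nest_paths(bounds):
    """Number of paths of control for a nest of DO statements, given one
    (m1, m2, m3) triple per DO, outermost first."""
    return prod(1 + (m2 - m1) // m3 for m1, m2, m3 in bounds)

def nest_control_tuples(bounds):
    """Every combination of control-variable values, one per path."""
    ranges = [range(m1, m2 + 1, m3) for m1, m2, m3 in bounds]
    return list(product(*ranges))

bounds = [(1, 3, 1), (1, 10, 3)]   # outer DO: 3 values, inner DO: 4 values
print(nest_paths(bounds))
print(nest_control_tuples(bounds)[:4])
```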


ALGOL  FOR  Statement* 


The syntax of the ALGOL FOR statement elements (given in a modified Backus
normal form) which are important for this loop discussion is as follows.

⟨for list element⟩ ::= ⟨arithmetic-expression⟩ |
        ⟨arithmetic-expression⟩ step ⟨arithmetic-expression⟩ until ⟨arithmetic-expression⟩ |
        ⟨arithmetic-expression⟩ while ⟨Boolean-expression⟩

⟨for list⟩ ::= ⟨for list element⟩ | ⟨for list⟩, ⟨for list element⟩

⟨for clause⟩ ::= for ⟨variable⟩ := ⟨for list⟩ do

⟨for statement⟩ ::= ⟨for clause⟩⟨statement⟩ | ⟨label⟩: ⟨for statement⟩

A  for  clause  causes  the  statement  which  it  precedes  (the  for  loop  body)  to  be 
repeatedly  executed  0  or  more  times.  In  addition,  it  performs  a  sequence  of 
assignments  to  the  control  variable  from  the  for  list. 

The sequential execution expected is the following: 1) initialize the control variable
by assignment from the value of the first for list element, 2) test for an invalid
assignment; if it is invalid, go to the successor statement of the for statement,
3) execute the statement (exit if a go to leading outside the statement is encountered),
4) perform the next assignment from the next for list element in the order written
to the control variable, doing any necessary evaluation of arithmetic expressions
using the current values of primaries, and then go to (2) again.

In  order  to  establish  parallel  paths  of  control  for  all  executions  of  the  loop  body, 
the  number  must  be  known  before  any  are  executed.  For  this  number  to  be  known, 
there  must  not  be  any  condition  which  is  dependent  upon  loop-created  data  that  can 
change this number. Consequently, for lists made up from for list elements of the
AE or AE step AE until AE types (category 1) are potentially unordered. Each for
list element of the AE while BE type (category 2) imposes an essentially ordered
sequence of loop body executions. A for list may consist of an alternating sequence
of for list elements from categories 1 and 2, in which case a similar sequence of
potentially unordered and essentially ordered executions of the loop body exists.
Any data-dependent conditional in the for statement which can cause exit from the
loop body imposes essential order. Hereafter, we assume no such conditional and,
thus, we consider only for list elements of category 1.
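
The division of a for list into potentially unordered and essentially ordered
runs can be sketched in Python. The tuple encoding of for list elements used
here ("expr", "step_until", "while") is a hypothetical representation chosen
for the illustration and is not part of ALGOL or of this report.

```python
from itertools import groupby

def category(element):
    """Category 1: AE or AE step AE until AE. Category 2: AE while BE."""
    kind = element[0]
    if kind in ("expr", "step_until"):
        return 1
    if kind == "while":
        return 2
    raise ValueError(f"unknown for list element kind: {kind!r}")

def segments(for_list):
    """Group consecutive for list elements of the same category."""
    return [(cat, list(group)) for cat, group in groupby(for_list, key=category)]

# for i := 1, 2 step 1 until n, i*2 while i < 100, 0 do ...
for_list = [
    ("expr", "1"),
    ("step_until", "2", "1", "n"),
    ("while", "i*2", "i < 100"),
    ("expr", "0"),
]
segs = segments(for_list)
print([(cat, len(group)) for cat, group in segs])
```

Each category 1 segment is a candidate for parallel establishment of its loop
body executions, while each category 2 segment must be executed in order.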

If no assignment is made into the control variable by any statement in the loop body,
then all its values are obtained from the for list. ALGOL permits assignment to
the control variable or to primaries in the arithmetic expressions of the step AE
until AE parts of a for list element to be made in the loop body. If such assignments
are unconditionally made, and if they are a function of only values existing prior to
the for statement, or of the prior control variable of these primaries, then the loop
control may be separately analyzed from the rest of the loop body.

*This is a partial syntax from reference 4 adapted by leaving undefined some
non-terminal syntactic elements such as "⟨arithmetic-expression⟩".

An  apparently  iterative  sequence  for  establishing  the  values  of  the  control  may  be 
replaced  by  parallel  enumeration  through  recognition  at  compile  time  of  the  avail¬ 
ability  of  the  values  of  all  primaries  necessary  for  determining  all  for  list  elements. 
Should  the  values  of  all  primaries  be  unsigned  numbers,  then  the  paths  can  actually 
be  established  at  compile  time.  If  any  of  these  primaries  is  a  variable  and  all  such 
variable  primaries  have  assignments  into  them  restricted  as  stated,  then  the  number 
of  paths  of  control  can  be  determined  prior  to  first  execution  of  the  noncontrol 
portion  of  the  loop  body. 
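
For a single step-until element whose primaries are fixed before the loop is
entered, the values of the control variable can be enumerated ahead of any
execution of the loop body. The Python sketch below follows the usual ALGOL 60
exhaustion test, (V - C) × sign(B) > 0, and assumes that A, B, and C are numeric
values not altered by the loop body. The function name is hypothetical.

```python
def step_until_values(a, b, c):
    """Values assigned to the control variable by `a step b until c`
    when a, b, and c are not changed by the loop body."""
    if b == 0:
        raise ValueError("a zero step never exhausts the element")
    sign = 1 if b > 0 else -1
    values = []
    v = a
    while (v - c) * sign <= 0:    # element not yet exhausted
        values.append(v)
        v += b
    return values

print(step_until_values(1, 2, 8))
print(step_until_values(5, -2, 0))
```

The length of the returned list is the number of paths of control that could
be established for this element.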

The  control  variable  becomes  undefined  if  exit  results  from  exhaustion  of  the  for 
list.  The  last  value  of  the  control  variable  is  preserved  if  exit  from  the  for 
statement  occurs  because  of  a  go  to  in  the  loop  body. 

Side  effects  of  a  procedure  call  can  cause  assignments  outside  its  body  or  exits 
other  than  the  return  to  point  of  call.  Such  procedure  calls  occur  in  the  for  body  or 
in  the  for  list.  Either  of  these  can  prevent  or  make  indeterminate  at  compile  time 
the establishment of parallel execution of the for body. Huxtable⁸ has classified
procedures as follows: normal - having no side effects, conditional sneaks - side
effects are conditional on context, and unconditional sneaks. The conditions
for recognizing normal procedures are as follows: no OWN variables, nonlocal
assignments,  abnormal  exits,  nor  use  of  any  switch;  internal  procedure  calls 
limited  to  normal  procedures;  parameters  exclude  label  and  switch;  and  no  explicit 
assignment  to  parameters  called  by  name.  Conditional  sneaks  are  the  same  as 
normal  except  that  explicit  assignment  to  parameters  called  by  name  is  permitted. 
All  other  procedures  are  assumed  to  be  unconditional  sneaks.  He  describes  a 
technique  for  classifying  procedures  which  involves  discovering  the  total  of  all 
possible  run-time  procedure  call  structures  of  the  program.  Although  further 
analysis  might  show  that  unconditional  sneaks  would  not  require  essential 
ordering,  the  effort  would  likely  be  greater  than  the  benefit  gained. 
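
The classification rules as stated above can be expressed as a small decision
function. This Python sketch only encodes the conditions listed in the text;
the record type and its field names are hypothetical, and nothing here
reproduces the technique for discovering the run-time call structures.

```python
from dataclasses import dataclass

@dataclass
class ProcedureFacts:
    """Facts about one procedure, assumed to be gathered by the compiler."""
    has_own_variables: bool = False
    assigns_nonlocal: bool = False
    has_abnormal_exit: bool = False
    uses_switch: bool = False
    calls_only_normal: bool = True
    has_label_or_switch_parameter: bool = False
    assigns_to_name_parameter: bool = False

def classify(p):
    """Return 'normal', 'conditional sneak', or 'unconditional sneak'."""
    disqualified = (
        p.has_own_variables
        or p.assigns_nonlocal
        or p.has_abnormal_exit
        or p.uses_switch
        or not p.calls_only_normal
        or p.has_label_or_switch_parameter
    )
    if disqualified:
        return "unconditional sneak"
    if p.assigns_to_name_parameter:
        return "conditional sneak"
    return "normal"

print(classify(ProcedureFacts()))
print(classify(ProcedureFacts(assigns_to_name_parameter=True)))
print(classify(ProcedureFacts(assigns_nonlocal=True)))
```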


COBOL  PERFORM  Statement* 

The  PERFORM  statement  is  used  to  depart  from  the  normal  sequence  of  execution 
in  order  to  execute  one  or  more  procedures  either  a  specified  number  of  times  or 
until  a  specified  condition  is  satisfied  and  then  return  control  to  the  normal 
sequence  -  the  statement  following  the  PERFORM. 

The four general formats are as follows:

1) PERFORM procedure-name-1 [THRU procedure-name-2]

2) PERFORM procedure-name-1 [THRU procedure-name-2] {identifier-1 | integer-1} TIMES

3) PERFORM procedure-name-1 [THRU procedure-name-2] UNTIL condition-1

4) PERFORM procedure-name-1 [THRU procedure-name-2]
      VARYING {index-name-1 | identifier-1}
         FROM {index-name-2 | literal-2 | identifier-2}
         BY {literal-3 | identifier-3}
         UNTIL condition-1
      [AFTER {index-name-4 | identifier-4}
         FROM {index-name-5 | literal-5 | identifier-5}
         BY {literal-6 | identifier-6}
         UNTIL condition-2
      [AFTER ... ]]


Each  procedure-name  is  the  name  of  a  section  or  paragraph  in  the  Procedure  Division. 
Each  identifier  represents  a  numeric  elementary  item  described  in  the  Data  Division. 

In  formats  2  and  4  with  the  AFTER  option,  each  identifier  represents  a  numeric  item 
with  no  positions  to  the  right  of  the  assumed  decimal  point.  Each  literal  represents 
a  numeric  literal. 

There is no necessary relationship between procedure-name-1 and procedure-name-2,
except that a consecutive sequence of operations is to be executed in every case
beginning at procedure-name-1 and ending with procedure-name-2. In particular, GO
TO and PERFORM statements may occur in the sequence. If there are two or more
direct paths to the return point, then procedure-name-2 may be the name of a paragraph
consisting of the EXIT statement, to which all of these paths must lead.

Format 1 corresponds to a call of a procedure without actual parameters. In format 2,
the procedures are performed the number of times specified and, therefore, parallel
paths of control may be unconditionally established. At PERFORM execution, the value
of identifier-1 or integer-1 must not be negative. If the value is zero, control passes
immediately to the statement following the PERFORM statement. Once the PERFORM is
initiated, any reference to identifier-1 has no effect on the number of times the
procedures are executed. If given as integer-1, the control may be set up in parallel
at compile time as long as the procedures in separate iterations are parallel. If given
as identifier-1, the number of control setups may be determined from the value of
identifier-1 on encountering the PERFORM at execute time.

The UNTIL condition parts of formats 3 and 4 preclude parallel execution, except in
those cases where the conditions are fully evaluatable at compile time, or the
ordered set of results of condition evaluations is determinable prior to executing the
procedures. To achieve the equivalent of the ALGOL construct until AE, the condition
would compare the index-name (or identifier) with the value of the desired limit.

Format 4 permits setting up one, two, or three control variables, testing their
corresponding conditions, and if all are false, executing the procedures. After this,
the last data-name is altered by the appropriate amount and the corresponding
condition retested. When a condition is true, other than the first, that data-name is
reinitialized, and the next preceding data-name is augmented and tested. When
condition-1 is true, the PERFORM is finished.
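
The sequencing just described for format 4 with one AFTER phrase can be
sketched in Python. This is an illustration of the prose above rather than a
statement of any particular COBOL implementation; conditions are modeled as
functions of both control variables, and all names are hypothetical.

```python
def perform_varying(from1, by1, until1, from2, by2, until2, body):
    """Sequential order of PERFORM ... VARYING v1 ... AFTER v2 ..."""
    v1, v2 = from1, from2
    while not until1(v1, v2):          # condition-1 true: PERFORM finished
        while not until2(v1, v2):      # condition-2 false: execute procedures
            body(v1, v2)
            v2 += by2                  # alter the last data-name and retest
        v2 = from2                     # reinitialize the last data-name
        v1 += by1                      # augment the preceding one and test

trace = []
perform_varying(
    1, 1, lambda i, j: i > 2,
    1, 1, lambda i, j: j > 3,
    lambda i, j: trace.append((i, j)),
)
print(trace)
```

When, as in this example, each condition only compares a control variable with
a fixed limit, the full list of value pairs is determinable before the
procedures are executed, which is the case in which parallel paths of control
could be set up.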

In addition to the restriction on parallelism caused by the UNTIL condition parts,
control variables other than the first are reinitialized to the FROM value during the
PERFORM. Their values may be altered by the procedures from the values when
the PERFORM was encountered; likewise, the index-names and identifiers
occurring in the BY part of any of the control variables and any variables occurring as
part of the conditions may be altered. Any such alteration will prevent parallel
execution unless numeric literals (or identifiers whose values are constrained
similarly to the ALGOL primaries used in AE) are used for these alterations.


UNCONDITIONAL  TRANSFERS 

An  unconditional  transfer  of  control  to  another  part  of  the  program  causes  the 
following problems: 1) possible creation of loops and 2) crossing a boundary of
scope  of  variables. 

Potential  creation  of  loops  is  detectable.  An  algorithm  for  detecting  loops  given 
the process connection matrix has been given by Marimont.⁷ All program loops
created  by  unconditional  transfers  include  a  backward  jump.  Not  all  backward 
jumps  indicate  loops,  since  the  order  of  sequential  programs  can  be  scrambled 
using  unconditional  transfers.  At  compile  time,  the  possible  paths  of  control  must 
be  indicated.  If  paths  are  mutually  exclusive,  not  only  should  the  mechanism  for 
enabling  one  path  be  provided,  but  also  the  outputs  from  paths  not  taken  should  be 
made invalid and the particular need for inputs required by such paths should be
released. 
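
A simple check for loops on a process connection matrix can be sketched as
follows. It is a plain pruning procedure written for this illustration and is
not claimed to be the algorithm of reference 7: processes with no remaining
predecessor or no remaining successor are deleted repeatedly, and a loop
exists exactly when some processes survive. The function name and the example
matrices are hypothetical.

```python
def processes_on_loops(matrix):
    """matrix[i][j] is truthy when control can pass from process i to j.
    Returns the processes left after pruning; the set is nonempty exactly
    when the control graph contains a loop."""
    alive = set(range(len(matrix)))
    changed = True
    while changed:
        changed = False
        for p in list(alive):
            has_pred = any(matrix[q][p] for q in alive)
            has_succ = any(matrix[p][q] for q in alive)
            if not (has_pred and has_succ):
                alive.discard(p)
                changed = True
    return alive

# 0 -> 1 -> 2 -> 1 (backward jump forming a loop), 2 -> 3
looping = [[0, 1, 0, 0],
           [0, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 0, 0]]

# 0 -> 2 -> 1 -> 3: a backward jump from 2 to 1 in the text, but no loop
scrambled = [[0, 0, 1, 0],
             [0, 0, 0, 1],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]

print(processes_on_loops(looping), processes_on_loops(scrambled))
```

The second matrix shows a program whose written order has been scrambled by
unconditional transfers: it contains a backward jump but no loop.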

During  the  execution  of  a  serial  program,  crossing  a  boundary  into  the  scope  of  a 
variable  serves  to  reserve  space  for,  but  assign  an  undefined  value  to,  any 
locally  named  (non-OWN)  variable  until  some  value  assignment  has  been  made  to 
it. In ALGOL, in particular, a variable globally named the same as a locally
named variable is inaccessible in the local block. Crossing a boundary out of the
scope of a defined variable as a result of an unconditional transfer (or otherwise)
should result in that local variable becoming undefined and the storage allocated to
it being released.

In  parallel  program  execution,  on  the  other  hand,  several  instances  of  variables  in 
different  scopes  having  the  same  name  may  be  accessed  concurrently.  Consequently, 
separate  internal  registers  are  required  for  all  valid  instances.  Also,  exiting  a 
scope of a variable is not sufficient for making invalid or undefining the variable,
since some other process may still require it. Consequently, either a variable
must remain defined until all segments of programs are executed which require it;
or separate copies of the variable should be created for each use, in which case use
becomes synonymous with release of the storage for the variable.


CONDITIONALS 


Language  constructs  for  conditional  establishment  of  paths  of  control  based  upon  the 
result  of  a  single  condition  are  used  in  most  programming  languages.  The  form  is 
to  first  evaluate  the  condition  and  then  set  up  a  path  of  control  for  the  single  path 
corresponding to the result. N-way branching constructs can be reduced to
selection  of  one  of  N  alternatives  also  based  upon  a  single  condition. 

Conditional  establishment  of  a  single  path  of  control  presents  a  problem  in  deciding 
which instance of a register is referenced by a process following the condition when
several instances could be meant, depending on the actual path executed.
Alternatives for solution of this problem are based on the duration of definition of an
instance.


SEQUENCE  OF  CONDITIONALS 

Languages that group conditionals for evaluation with action selected as the first to
be true (decision tables), or languages that have a list of condition-action pairs of
which the first condition to be true selects the action (LISP), provide opportunity
for executing the conditionals in parallel. When logical relations among a group of
conditions are evaluated to select a particular action process, the individual
conditions may be evaluated in parallel as long as 1) evaluation of any conditional
does not modify the result of some other conditional in the same group, and 2) all
operands required for evaluation are defined.
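
A minimal Python sketch of this idea evaluates all conditions of a group
concurrently and then selects the action paired with the first true condition
in the given order. It assumes the two qualifications above hold, that is, the
conditions are free of side effects and all operands are defined. The function
name and the example pairs are hypothetical, and a thread pool merely stands in
for a parallel machine.

```python
from concurrent.futures import ThreadPoolExecutor

def select_action(pairs, operands):
    """pairs: (condition, action) in the order given by the programmer.
    Every condition is evaluated; the first true one selects the action."""
    with ThreadPoolExecutor() as pool:
        # map() returns results in the order of `pairs`, whatever the
        # order in which the individual evaluations complete.
        results = list(pool.map(lambda pair: pair[0](operands), pairs))
    for (_condition, action), is_true in zip(pairs, results):
        if is_true:
            return action
    return None

pairs = [
    (lambda x: x["n"] < 0, "negative"),
    (lambda x: x["n"] == 0, "zero"),
    (lambda x: x["n"] >= 0, "non-negative"),
]
print(select_action(pairs, {"n": 0}))
```

For n equal to zero both the second and third conditions are true, and the
given order decides in favor of the second, as sequential evaluation would.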

The  first  qualification  may  be  met  by  having  temporary  storage  locations  into  which 
all  instances  of  store  operations  associated  with  conditional  evaluation  are  made, 
with stacking or tagging to indicate the creating condition. A conflict may occur
because of name reuse in store operations temporary to or incidental to the
condition evaluations. This conflict may be resolved by reference to a stack or tag
associated with the name occurring in the nearest condition not following the
condition being evaluated. By this means proper conditional evaluation will take place.
The  values  of  named  variables  that  are  later  required  and  are  created  during 
evaluation  of  the  satisfied  condition  may  be  permanently  stored  prior  to  continuing. 

The  second  qualification  requires  defined  (valid)  operands  for  conditional 
evaluation.  To  permit  recognition  of  these  we  must  know  which  instance  of  each 
name is appropriate. For some conditionals, no instance of an operand need be
appropriate. For example, assume that creation occurs as an action following
some conditional evaluated earlier in the given sequence. That prior conditional
must have become true and that action taken before the current conditional can be
true. Therefore, in this case the conditional is inconsequential. Several creation
points may be appropriate. With sequential evaluation, a stack can be used to
show the order. With parallel evaluation, the completion order may be arbitrary,
so an indication of the creating process is required to preserve the sequence as
originally given. Parallel evaluation of conditionals appears generally to result in
unnecessary work for the sake of finding the desired single action more rapidly.
When sufficient conditionals are evaluated to uniquely select the following action
process, any other conditional evaluation can be suspended (and the temporaries
there created can be released).

The single action selection is appropriate to sequential languages. If this is the
case, the sequential language has implied false conditions as inputs to all but the
first occurring condition. The opportunity to sequentially test a particular
conditional only occurs when all prior conditionals are false. Parallel evaluation of
conditionals in such a case should select the first occurring true conditional for
action. There are applications where multiple actions may be appropriate and
therefore parallel action paths may be executed. The algorithm will identify these.
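
The selection rule for a sequential language can be illustrated with a minimal
sketch. It is not taken from the report: the function name select_first_true, the
use of a thread pool, and the sample conditions are assumptions made only for
illustration. Each condition receives a private copy of the operands, standing in
for the temporary storage described above, and the first true condition in the
given order is selected regardless of completion order.

```python
from concurrent.futures import ThreadPoolExecutor


def select_first_true(conditions, operands):
    """Evaluate all conditions concurrently and return the index of the
    first one, in the given order, that is true (None if none is true)."""

    def evaluate(condition):
        # Private copy of the operands: stores made during evaluation
        # cannot alter what another condition sees.
        return bool(condition(dict(operands)))

    with ThreadPoolExecutor() as pool:
        # pool.map returns results in submission order, not completion order.
        results = list(pool.map(evaluate, conditions))

    for index, value in enumerate(results):
        if value:
            return index
    return None


# Hypothetical conditional sequence over one operand x.
conditions = [
    lambda env: env["x"] < 0,
    lambda env: env["x"] % 2 == 0,
    lambda env: env["x"] > 3,
]
selected = select_first_true(conditions, {"x": 8})
```

With x = 8 both the second and third conditions are true, and the second is
selected because it occurs first. The sketch waits for every evaluation to finish;
it does not attempt the early suspension of the remaining evaluations.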


DURATION  OF  DEFINITION  OF  AN  INSTANCE 

Creation  of  instances  of  registers  suggests  that  such  registers  are  defined  for  all 
time  after  creation.  In  practical  application,  these  instances  have  a  last  use  as 
process  input.  At  completion  of  this  last  use,  the  instance  is  of  no  further  use  and 
may  be  "unnamed",  which  serves  to  make  it  undefined  thereafter.  The  interval 
between  naming  at  creation  and  unnaming  after  last  use  is  the  duration  of  definition. 
This  study  has  not  been  primarily  concerned  with  questions  regarding  duration  of 
definition,  since  such  questions  are  more  properly  related  to  the  allocation  or 
mapping of names into memory without conflict. In order to exploit the duration of
necessary definition among variables, a many-to-one mapping of names into a
memory location on a non-overlapping duration basis is required.
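
A many-to-one mapping of this kind can be sketched as a greedy assignment over
durations. This is an illustrative assumption, not a method given in this report:
the function name assign_locations, the representation of a duration as a pair
(creation step, last-use step), and the sample data are all hypothetical.

```python
def assign_locations(durations):
    """Map each name to a location index so that names sharing a location
    have non-overlapping durations.

    durations: dict of name -> (created, last_use), both step numbers.
    """
    assignment = {}
    busy_until = []  # busy_until[loc] = last-use step of the name held there

    # Visit names in order of creation.
    for name, (created, last_use) in sorted(
        durations.items(), key=lambda item: item[1][0]
    ):
        for loc, last in enumerate(busy_until):
            # Reuse a location only if its previous name was unnamed
            # strictly before this name is created.
            if last < created:
                busy_until[loc] = last_use
                assignment[name] = loc
                break
        else:
            # No free location: open a new one.
            busy_until.append(last_use)
            assignment[name] = len(busy_until) - 1
    return assignment


durations = {"a": (0, 2), "b": (1, 4), "c": (3, 5), "d": (5, 6)}
locations = assign_locations(durations)
```

Here four names fit into two locations: a and c share one, b and d share the other.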

The algorithm for determining essential process ordering uses less information than
is required for determining the last use of a name. For example, the T relation
between multi-input, multi-output processes eliminates a number of process
comparisons, any of which may include the last use of any particular variable. Also,
determination of any one name in the intersection of output and input sets is sufficient
to establish essential order between two processes without completing all possible
name comparisons in these sets. Determining the last use of a name requires
checking all input sets of processes that are T successors of the creating process
for occurrence of the particular created name. Some reduction in the amount of
checking can be achieved in those cases where the language provides a limit on the
scope of a variable and the durations are extended to this limit even if last use is
earlier. Other reduction may occur when only one of several separate instances
may be referenced by an execution, in which case the originally formulated program
must have included conditionals. The naming could be the same for all such
mutually exclusive instances.
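
The check over T successors can be sketched as follows. This is only an
illustration under stated assumptions: processes are numbered by position in the
given order, T is supplied as a table of successor sets, and the function name
final_users and the sample data are hypothetical. Because T successors of the
creating process may be unordered among themselves, the sketch returns the set of
using processes that have no later user among their own T successors; the instance
may be unnamed once all of these have completed.

```python
def final_users(inputs, t_successors, creator, name):
    """Return the T successors of `creator` that use `name` as input and
    have no T successor that also uses it."""
    users = {p for p in t_successors[creator] if name in inputs[p]}
    return {p for p in users if not (t_successors[p] & users)}


# Hypothetical program of five processes.
inputs = {0: set(), 1: {"x"}, 2: {"y"}, 3: {"x", "y"}, 4: {"z"}}
# T as it would follow from outputs x (process 0), y (1), z (2).
t_successors = {0: {1, 2, 3, 4}, 1: {2, 3, 4}, 2: {4}, 3: set(), 4: set()}

last_x = final_users(inputs, t_successors, 0, "x")
last_y = final_users(inputs, t_successors, 1, "y")
```

In this example x has a single last use in process 3, while y is last used by
processes 2 and 3, which are unordered with respect to each other.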

The recognition program for last use in present programming languages is impeded
by the implicit reuse of names as outputs of processes with actual independent
meaning. Uniquely renaming those names having multiple meanings as proposed
achieves separate registers for each so that no name has more than one meaning.

The expense of doing this is that no register becomes undefined and thus no register
can be reused in a program.

A comparison in the programmer-given name space of $O_i \cap O_j \neq \emptyset$ is an indication
of multiple use of names. Each such name in the original order of processes
determines a partition across the set of processes that use that name as input. If
a last use of a variable occurs in a particular statement, the use should be early
in the statement evaluation so that the location can be freed, or so that the
variable can be reassigned by a parallel path. Other variables having later use
may be postponed in the particular statement evaluation. A suggested algorithm
may be to minimize the duration in storage for any variable, since the duration is
loosely related to the freedom to parallel process.

Duration of instances will be considered in more detail in a future report, where the
problem of identifying the instances from names will be treated.


RELATED INVESTIGATIONS

The  algorithm  is  sequentially  formulated.  Application  of  the  algorithm  to  itself 
is  suggested  for  analysis  done  on  a  highly  parallel  machine.  Several  alternative 
ways  to  view  this  algorithm  are  suggested. 

Application  of  the  algorithm  to  the  syntactic  definition  of  context-free  languages 
to  classify  productions  of  the  language  that  might  be  done  in  parallel  is  described. 
A  method  of  determining  the  "essential  complexity"  of  a  programming  language 
is  suggested. 


PARALLEL  APPLICATION  OF  THE  ALGORITHM 

An  alternative  way  of  applying  the  algorithm  is  to  analyze  at  each  step  in  parallel 
all  previously  unanalyzed  S  relations  then  existing.  For  each,  the  result  may  be 
S,  in  which  case  T  is  extended;  or  the  result  may  be  unordered,  in  which  case  a 
new  set  of  unanalyzed  mS  relations  will  be  created.  This  new  set  of  S  relations, 
if nonempty, connects processes further apart in the given order. If empty, then
analysis is complete. Thus, for N total processes, there are no more than
(N-1) sets of comparisons required to detect all instances of parallelism. If we
are given a linear ordering of N processes, the worst case is N parallel processes.
The first step would perform (N-1) input-output comparisons and produce (N-2)
relations in mS. Each of these relations connects processes two apart in the initial
order. The second step would thus have (N-2) comparisons, and so on for (N-1)
steps, until all ½N(N-1) comparisons are made. It is necessary to retain the
ability to link between any two disconnected chains of S-linked processes (each
chain having 0 or more processes) until it is certain that there is no connecting
S relation between the chains.
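
The worst-case total can be checked by summing the comparisons made at each step.
Taking step k to perform (N-k) comparisons, as described above, the arithmetic is:

```latex
\[
\sum_{k=1}^{N-1} (N-k) \;=\; (N-1) + (N-2) + \cdots + 1 \;=\; \tfrac{1}{2}\,N(N-1)
\]
```

For example, N = 5 fully parallel processes require 4 + 3 + 2 + 1 = 10 comparisons
spread over 4 steps.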

It is possible to develop T initially, and from T develop S. If all ½N(N-1)
comparisons are conceded as being required, they could all be done in parallel.

Any  nonempty  intersection  causes  an  entry  in  T.  Alternatively,  there  may  be  a 
significant  advantage  in  completing  in  parallel  all  comparisons  of  a  particular 
process  output  (or  output  set)  with  all  successor  input  sets.  When  done  for  all 
processes,  T  may  be  completed  by  forming  the  transitive  closure. 
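
A minimal sketch of this alternative follows. It rests on assumptions that are not
part of the report: processes are given as lists of input and output name sets in
the programmer's order, a later process is taken to follow an earlier one
essentially when it reads a name the earlier one creates (other kinds of conflict
are ignored), and the function name essential_order and the sample data are
hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def essential_order(inputs, outputs):
    """Return T as a dict: process index -> set of successor indices."""
    n = len(inputs)
    pairs = list(combinations(range(n), 2))  # all N(N-1)/2 pairs, i before j

    def depends(pair):
        i, j = pair
        # Any one shared name is enough; isdisjoint stops at the first hit.
        return not outputs[i].isdisjoint(inputs[j])

    # All comparisons are independent, so they may run concurrently.
    with ThreadPoolExecutor() as pool:
        hits = list(pool.map(depends, pairs))

    t = {i: set() for i in range(n)}
    for (i, j), hit in zip(pairs, hits):
        if hit:
            t[i].add(j)

    # Transitive closure. Every entry points forward in the given order,
    # so a single sweep from the last process to the first suffices.
    for i in reversed(range(n)):
        for j in list(t[i]):
            t[i] |= t[j]
    return t


inputs = [set(), {"x"}, {"y"}, {"x", "y"}, {"z"}]
outputs = [{"x"}, {"y"}, {"z"}, {"w"}, {"v"}]
t = essential_order(inputs, outputs)
```

Any pair (i, j) with j absent from t[i] is unordered; in this example processes 2
and 3 may proceed in parallel, as may processes 3 and 4.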


PARALLELISM IN LANGUAGE SYNTACTIC DEFINITION

One  question  which  has  been  explored  is  the  applicability  of  the  algorithm  for  the 
detection  of  parallelism  to  the  syntactic  definition  of  certain  languages,  such  as 
ALGOL.  The  results  might  be  classification  of  the  productions  of  the  language 
which can be applied in parallel, so that the syntax recognition part of the
compilation process itself could be speeded up by the application of multiprocessing.
This would be essentially an experimental study, which to be effectively carried out
would require a computer program for the algorithm because of the large number
of syntactic classes to be considered. It appears likely that empirical separation
of the syntactical elements of a compiler language would achieve results not much
worse than those achieved by application of the algorithm.

Theoretically, it might be useful to have a description of classes of productions
which can be performed in parallel. Some insight into the "essential complexity"
of the language might be obtained in this way.

The "essential complexity" of the syntax of most programming languages may be
reduced considerably by applying an algorithm developed by Parikh.8 He shows
that any context-free language can be replaced by an ambiguity preserving
grammar, with all productions except S (the initial non-terminal) in the form:

A -> a A b

A -> c

where A is any specific nonterminal ≠ S, and a, b, and c are strings of terminals
with both a, b not null.

As  an  example  of  the  effect  of  application  of  this  replacement,  let  an  arithmetic 
statement  be  defined  by: 

A1: <arithmetic statement> ::= <variable> = <arithmetic expression>

A2: <arithmetic expression> ::= <term> | <arithmetic expression> <add op> <term>

A3: <add op> ::= + | -

T:  <term> ::= <factor> | <term> <mult op> <factor>

M:  <mult op> ::= * | /

F:  <factor> ::= <integer> | <variable> | <arithmetic expression>

If we use the capital abbreviations preceding the nonterminal syntactic classes,
the non-terminal vocabulary V_N = {A1, A2, A3, M, T, F}. If we abbreviate
integer by i and variable by v, the terminal vocabulary V_T = {i, v, =, +, -, *, /}.

In our language syntax subset, the initial nonterminal S is A1. In effect, Parikh
asserts that the only other nonterminals required are those which are directly
recursive. In what follows, we show that the given definition of arithmetic
statement, involving six syntactic classes (A1, A2, A3, M, T, and F), can be
written in terms of only two syntactic classes (A1, T). Of course, A1 is
necessary since it is the "sentence." The single recursive syntactic class is not
necessarily T (F or A2 would have done as well). This illustrates the
recognition and elimination of indirect recursion.

Table I shows the original productions rewritten as individual productions, and
successive replacements yielding in the final column the reduction to two
syntactic classes A1 and T. Productions of the form T -> T are eliminated as redundant.


Table I

Reduction of Syntactic Classes

Original         Replace A2       Replace F        Replace A3 by + and -;
                 by T             by T             M by * and /

A1 -> v=A2       A1 -> v=T        A1 -> v=T        A1 -> v=T
A2 -> T
A2 -> A2 A3 T    T -> T A3 T      T -> T A3 T      T -> T+T
A3 -> +          A3 -> +          A3 -> +          T -> T-T
A3 -> -          A3 -> -          A3 -> -
T -> F           T -> F
T -> T M F       T -> T M F       T -> T M T       T -> T*T
M -> *           M -> *           M -> *           T -> T/T
M -> /           M -> /           M -> /
F -> A2          F -> T
F -> i           F -> i           T -> i           T -> i
F -> v           F -> v           T -> v           T -> v

Final column: for two righthand T's, substitute i and v for one T, then the other

A1 -> v=T

T -> i+T         T -> i-T         T -> i*T         T -> i/T
T -> v+T         T -> v-T         T -> v*T         T -> v/T
T -> T+i         T -> T-i         T -> T*i         T -> T/i
T -> T+v         T -> T-v         T -> T*v         T -> T/v

T -> i
T -> v
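
The last replacement step of Table I is mechanical and can be sketched in a few
lines. The function names expand_productions and mentions_own_class_at_most_once
are hypothetical, productions are written as plain strings, and the assignment
symbol is taken here to be "=". The sketch substitutes i and v first for the left
T and then for the right T of each production T -> T op T.

```python
def expand_productions(operators=("+", "-", "*", "/"), atoms=("i", "v")):
    """Build the final production set of the reduction to classes A1 and T."""
    rules = ["A1 -> v=T"]
    for op in operators:
        # Substitute each atom for the left T, keeping the right T.
        rules += [f"T -> {atom}{op}T" for atom in atoms]
        # Substitute each atom for the right T, keeping the left T.
        rules += [f"T -> T{op}{atom}" for atom in atoms]
    # Non-recursive productions for T.
    rules += [f"T -> {atom}" for atom in atoms]
    return rules


def mentions_own_class_at_most_once(rule):
    """True when the right side names the rule's own class at most once."""
    head, body = rule.split(" -> ")
    return body.count(head) <= 1


productions = expand_productions()
all_reduced = all(mentions_own_class_at_most_once(r) for r in productions)
```

The result has 19 productions: one for A1, sixteen recursive productions for T
(four operators, two atoms, two positions), and the two productions T -> i and
T -> v.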


Application  of  the  algorithm  for  the  detection  of  parallelism  to  the  syntactic 
definition  of  a  programming  language  requires  that  the  order  of  application  of  the 
productions  be  specified.  The  order  chosen  by  the  programmer  in  the  syntax 
recognition  portion  of  a  compiler  for  a  particular  language  presumably  provides 
more  nearly  efficient  recognition  of  the  more  likely  strings.  Therefore,  some 
value  judgment  is  required  in  selecting  this  order  which  is  unrelated  to  the 
actual  syntax.  Experimental  determination  of  the  relative  occurrence  over  a 
large  set  of  programs  of  the  various  strings  allowed  by  a  language  could  be  used 
as  an  aid  in  finding  improved  ordering  for  productions.  Following  all  possible 
production  paths  will  obviously  recognize  the  string  if  correct,  at  the  expense  of 
following mostly incorrect production paths. A compromise in the number of
levels in the syntax that matches the number of parallel paths that can be
concurrently followed may be desirable. Introduction of other equivalent sets of
productions (in other orders) may be done without reducing the power of the
language. However, as our example indicates, such introduction may increase the
number of productions very rapidly and complicate the recognition procedure so
much that application of Parikh's result would be necessary to practicably reduce
the number of syntactic classes to a minimum.


PROGRAM FOR THE NEXT INTERVAL


1.  Develop  techniques  for  recognizing  instances  of  registers  given  the  names 
in  a  program. 

2.  Continue  to  investigate  formal  program  structures  with  emphasis  on  arrays. 

3.  Continue  to  identify  the  effect  of  language  features  on  parallelism. 

4.  Examine criteria for partitioning a program into processes.

5.  Discuss implementation of parallel analysis.

6.  Describe a language for simulating the essential order detection given the
instances.